An empirical analysis of the probabilistic K-nearest neighbour classifier
نویسندگان
چکیده
The probabilistic nearest neighbour (PNN) method for pattern recognition was introduced to overcome a number of perceived shortcomings of the nearest neighbour (NN) classifiers namely the lack of any probabilistic semantics when making predictions of class membership. In addition the NN method possesses no inherent principled framework for inferring the number of neighbours, K, nor indeed associated parameters related to the chosen metric. Whilst the Bayesian inferential methodology underlying the PNN classifier undoubtedly overcomes these shortcomings there has been to date no extensive systematic study of the performance of the PNN method nor any comparison with the standard non-probabilistic approach. We address this issue by undertaking an extensive empirical study which highlights the essential characteristics of PNN when compared to a cross-validated K-NN.
منابع مشابه
Optimal weighted nearest neighbour classifiers
We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted k-nearest neighbour classifier depends asymptotically only on the dimension d of the feature vec...
متن کاملA Variable Metric Probabilistic k-Nearest-Neighbours Classifier
12:06 30th March 2004 Abstract The k -nearest neighbour (k -nn) model is a simple, popular classifier. Probabilistic k -nn is a more powerful variant in which the model is cast in a Bayesian framework using (reversible jump) Markov chain Monte Carlo methods to average out the uncertainy over the model parameters. The k -nn classifier depends crucially on the metric used to determine distances b...
متن کاملGenerating Estimates of Classification Confidence for a Case-Based Spam Filter
Producing estimates of classification confidence is surprisingly difficult. One might expect that classifiers that can produce numeric classification scores (e.g. k-Nearest Neighbour or Naive Bayes) could readily produce confidence estimates based on thresholds. In fact, this proves not to be the case, probably because these are not probabilistic classifiers in the strict sense. The numeric sco...
متن کاملAn experimental comparison of neural and statistical non-parametric algorithms for supervised classification of remote-sensing images
An experimental analysis of the use of different neural models for the supervised classification of multisensor remote-sensing data is presented. Three types of neural classifiers are considered: the Multilayer Perceptron, a kind of Structured Neural Network, proposed by the authors, that allows the interpretation of the network operation, and a Probabilistic Neural Network. Furthermore, the k-...
متن کاملScene Classification Via pLSA
Given a set of images of scenes containing multiple object categories (e.g. grass, roads, buildings) our objective is to discover these objects in each image in an unsupervised manner, and to use this object distribution to perform scene classification. We achieve this discovery using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature, here ap...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Pattern Recognition Letters
دوره 28 شماره
صفحات -
تاریخ انتشار 2007